Detection and synchronization of audio transmissions using complex audio signals

ABSTRACT

Methods and systems for improved detection of audio transmissions are provided. In one embodiment, a method is provided that includes receiving an audio signal containing an audio transmission. The audio transmission may contain a predetermined portion that was initially generated based on an expected sequence of complex-valued signals. A real portion of the expected sequence of complex-valued signals may be compared to the received audio signal to identify a first portion of the received audio signal. A complex portion of the expected sequence may be compared to portions of the received audio signal near the first portion of the received audio signal to identify a second portion of the received audio signal. An arrival time of the audio transmission may be determined based on the second portion of the received audio signal.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims priority to U.S. Pat. Application No. 17/493,315, filed Oct. 4, 2021, which claims priority to U.S. Pat. Application No. 16/879,333, filed on May 20, 2020, each of which is incorporated by reference in its entirety.

BACKGROUND

Data often needs to be transmitted between computing devices without connecting both devices to the same computing network. For example, in certain applications, a computing network may not exist near the computing devices, or it may be too cumbersome (e.g., may take too long) to connect one or both of the computing devices to a nearby computing network. Therefore, data may be transmitted directly from one computing device to another computing device.

SUMMARY

The present disclosure presents new and innovative methods and systems for detecting the arrival of audio transmissions that contain data. In a first aspect, a method is provided that includes receiving an audio signal that contains an audio transmission, the audio transmission containing (i) a predetermined portion and (ii) data for transmission using the audio transmission, and computing a first plurality of similarity measures between (i) a real portion of an expected sequence of complex values and (ii) a first plurality of portions of the audio signal, the first plurality of portions of the audio signal beginning at a first plurality of times. The method may further include determining that a first portion of the audio signal from among the first plurality of portions of the audio signal corresponds to the largest of the first plurality of similarity measures, the first portion of the audio signal beginning at a first time from among the first plurality of times, and computing a second plurality of similarity measures between (i) the real portion of the expected sequence and an imaginary portion of the expected sequence of complex values and (ii) a second plurality of portions of the audio signal, the second plurality of portions of the audio signal beginning at a second plurality of times. The method may also include determining that a second portion of the audio signal from among the second plurality of portions of the audio signal corresponds to the largest of the second plurality of similarity measures, the second portion of the audio signal beginning at a second time from among the second plurality of times, and determining an arrival time of the audio transmission based on the second time.
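As a non-limiting illustration, the two comparison stages of the first aspect may be sketched as follows. The sketch is written in Python using only the standard library and rests on several assumptions that do not come from the disclosure: the hypothetical helper names (`make_expected`, `similarity`, `detect_arrival`, `simulate_received`), the use of the magnitude of an inner product as the similarity measure, the particular chirp used as the expected sequence, the search radius of 8 samples, and the simulated phase shift and noise level.

```python
import cmath
import math
import random


def make_expected(n_samples):
    # Hypothetical expected sequence: a unit-amplitude complex chirp (an assumption).
    return [cmath.exp(1j * math.pi * 0.5 * n * n / n_samples) for n in range(n_samples)]


def similarity(template, window):
    # Correlation-style similarity: magnitude of the inner product (an assumption).
    return abs(sum(t.conjugate() * w for t, w in zip(template, window)))


def detect_arrival(signal, expected, search_radius=8):
    n = len(expected)
    last_start = len(signal) - n
    real_portion = [value.real for value in expected]

    # Stage 1: compare only the real portion against every candidate window.
    first_time = max(range(last_start + 1),
                     key=lambda t: similarity(real_portion, signal[t:t + n]))

    # Stage 2: compare the full complex sequence against windows near first_time.
    low = max(0, first_time - search_radius)
    high = min(last_start, first_time + search_radius)
    second_time = max(range(low, high + 1),
                      key=lambda t: similarity(expected, signal[t:t + n]))
    return first_time, second_time


def simulate_received(expected, offset, phase_shift, tail=60, noise_std=0.05, seed=0):
    # Real-valued received signal: noise plus the phase-shifted real part of the sequence.
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, noise_std) for _ in range(offset + len(expected) + tail)]
    for i, value in enumerate(expected):
        signal[offset + i] += (value * cmath.exp(1j * phase_shift)).real
    return signal


expected = make_expected(256)
received = simulate_received(expected, offset=37, phase_shift=math.pi / 2)
first_time, second_time = detect_arrival(received, expected)
print(first_time, second_time)
```

In this simulated case the received signal carries a phase shift, so the real-only comparison may peak a sample or so away from the true start, while the complex comparison in the second stage is insensitive to the shift and may recover the start time more precisely. Times here are expressed as sample indices.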

In a second aspect according to the first aspect, the second plurality of times occur within a predetermined threshold time difference of the first time.

In a third aspect according to the second aspect, the predetermined threshold time difference is less than or equal to 1 millisecond.

In a fourth aspect according to the third aspect, the predetermined threshold time difference is less than or equal to 0.5 milliseconds.

In a fifth aspect according to any of the second through fourth aspects, the predetermined threshold time difference is determined as a number of audio samples.

In a sixth aspect according to the fifth aspect, the predetermined threshold time difference is less than or equal to 15 audio samples.
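For illustration only, a threshold time difference given in milliseconds may be expressed as a whole number of audio samples as sketched below. The function name `threshold_in_samples` is hypothetical, and the sampling rates of 44,100 Hz and 48,000 Hz are assumptions, as the aspects above do not specify a sampling rate.

```python
def threshold_in_samples(threshold_ms, sample_rate_hz):
    # Number of whole audio samples that fit within the threshold time difference.
    return int(threshold_ms * sample_rate_hz / 1000)


# Assumed sampling rates; the aspects above do not specify one.
for rate_hz in (44_100, 48_000):
    print(rate_hz, threshold_in_samples(1.0, rate_hz), threshold_in_samples(0.5, rate_hz))
```

Under these assumed rates, a threshold of 15 audio samples corresponds to less than 0.5 milliseconds.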

In a seventh aspect according to any of the first through sixth aspects, the real portions of the expected sequence indicate magnitudes for samples of the expected sequence.

In an eighth aspect according to any of the first through seventh aspects, the imaginary portions of the expected sequence indicate phase differences for samples of the expected sequence.

In a ninth aspect according to the eighth aspect, the predetermined portion of the audio transmission is initially transmitted to include the real portion of the expected sequence and the phase differences indicated by the imaginary portion of the expected sequence.

In a tenth aspect according to any of the first through ninth aspects, the first plurality of portions of the audio signal and the second plurality of portions of the audio signal contain only real-valued audio signals.

In an eleventh aspect according to any of the first through tenth aspects, the expected sequence of complex values is a continuous linear phase chirp signal.

In a twelfth aspect according to any of the first through eleventh aspects, at least one of the first plurality of similarity measures and the second plurality of similarity measures is calculated as correlation measures.

In a thirteenth aspect according to the twelfth aspect, the correlation measures include convolution measures.

In a fourteenth aspect, a system is provided that includes a processor and memory. The memory may store instructions which, when executed by the processor, cause the processor to receive an audio signal that contains an audio transmission, the audio transmission containing (i) a predetermined portion and (ii) data for transmission using the audio transmission, and compute a first plurality of similarity measures between (i) a real portion of an expected sequence of complex values and (ii) a first plurality of portions of the audio signal, the first plurality of portions of the audio signal beginning at a first plurality of times. The memory may also store instructions which, when executed by the processor, cause the processor to determine that a first portion of the audio signal from among the first plurality of portions of the audio signal corresponds to the largest of the first plurality of similarity measures, the first portion of the audio signal beginning at a first time from among the first plurality of times, and compute a second plurality of similarity measures between (i) the real portion of the expected sequence and an imaginary portion of the expected sequence of complex values and (ii) a second plurality of portions of the audio signal, the second plurality of portions of the audio signal beginning at a second plurality of times. The memory may store further instructions which, when executed by the processor, cause the processor to determine that a second portion of the audio signal from among the second plurality of portions of the audio signal corresponds to the largest of the second plurality of similarity measures, the second portion of the audio signal beginning at a second time from among the second plurality of times, and determine an arrival time of the audio transmission based on the second time.

In a fifteenth aspect according to the fourteenth aspect, the second plurality of times occur within a predetermined threshold time difference of the first time.

In a sixteenth aspect according to the fifteenth aspect, the predetermined threshold time difference is determined as a number of audio samples.

In a seventeenth aspect according to any of the fourteenth through sixteenth aspects, the imaginary portions of the expected sequence indicate phase differences for samples of the expected sequence.

In an eighteenth aspect according to the seventeenth aspect, the predetermined portion of the audio transmission is initially transmitted to include the real portion of the expected sequence and the phase differences indicated by the imaginary portion of the expected sequence.

In a nineteenth aspect according to any of the fourteenth through eighteenth aspects, the first plurality of portions of the audio signal and the second plurality of portions of the audio signal contain only real-valued audio signals.

In a twentieth aspect according to any of the fourteenth through nineteenth aspects, the expected sequence of complex values is a continuous linear phase chirp signal.

The features and advantages described herein are not all-inclusive and, in particular, many additional features and advantages will be apparent to one of ordinary skill in the art in view of the figures and description. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and not to limit the scope of the disclosed subject matter.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 illustrates a system according to an exemplary embodiment of the present disclosure.

FIG. 2 illustrates an audio transmission according to an exemplary embodiment of the present disclosure.

FIGS. 3A-3B illustrate a transmitter/receiver array according to an exemplary embodiment of the present disclosure.

FIG. 4 illustrates a scenario according to an exemplary embodiment of the present disclosure.

FIG. 5A illustrates a predetermined portion of an audio transmission according to an exemplary embodiment of the present disclosure.

FIG. 5B illustrates plots of a real portion and an imaginary portion of a predetermined portion of an audio transmission according to an exemplary embodiment of the present disclosure.

FIGS. 6A-6C illustrate comparisons of a received audio signal to an expected sequence according to an exemplary embodiment of the present disclosure.

FIG. 6D illustrates a plot of similarity measures for different windows of a received audio signal according to an exemplary embodiment of the present disclosure.

FIGS. 7A-7C illustrate a comparison of complex values of an expected sequence to a received audio signal according to an exemplary embodiment of the present disclosure.

FIG. 8 illustrates a computing device according to an exemplary embodiment of the present disclosure.

FIG. 9 illustrates a method according to an exemplary embodiment of the present disclosure.

FIG. 10 illustrates a computing system according to an exemplary embodiment of the present disclosure.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

Aspects of the present disclosure relate to synchronizing the processing of audio transmissions containing data. Further, various aspects relate to synchronizing the processing by determining a time of arrival for audio transmissions.

Various techniques and systems exist to exchange data between computing devices without connecting to the same communication network. For example, the computing devices may transmit data via direct communication links between the devices. In particular, data may be transmitted according to one or more direct wireless communication protocols, such as Bluetooth®, ZigBee®, Z-Wave®, Radio-Frequency Identification (RFID), Near Field Communication (NFC), and Wi-Fi® (e.g., direct Wi-Fi links between the computing devices). However, each of these protocols relies on data transmission using electromagnetic waves at various frequencies. Therefore, in certain instances (e.g., ZigBee®, Z-Wave®, RFID, and NFC), computing devices may typically require specialized hardware to transmit data according to these wireless communication protocols. In further instances (e.g., Bluetooth®, ZigBee®, Z-Wave®, and Wi-Fi®), computing devices may typically have to be communicatively paired in order to transmit data according to these wireless communication protocols. Such communicative pairing can be cumbersome and slow, reducing the likelihood that users associated with one or both of the computing devices will utilize the protocols to transmit data.

Therefore, there exists a need to wirelessly transmit data in a way that (i) does not require specialized hardware and (ii) does not require communicative pairing prior to data transmission. One solution to this problem is to transmit data using audio transmissions. For example, FIG. 1 illustrates a system 100 according to an exemplary embodiment of the present disclosure. The system 100 includes two computing devices 102, 104 configured to transmit data 122, 124 using audio transmissions 114, 116. In particular, each computing device 102, 104 includes a transmitter 106, 108 and a receiver 110, 112. The transmitters 106, 108 may include any type of device capable of generating audio signals, such as speakers. In certain implementations, the transmitters 106, 108 may be implemented as a speaker built into the computing device 102, 104. For example, one or both of the computing devices may be a smartphone, tablet computer, and/or laptop with a built-in speaker that performs the functions of the transmitter 106, 108. In other implementations, the transmitters 106, 108 may be implemented as a speaker external to the computing device 102, 104. For example, the transmitters 106, 108 may be implemented as one or more speakers externally connected to the computing device 102, 104.

The receivers 110, 112 may include any type of device capable of receiving audio transmissions and converting the audio transmissions into signals (e.g., digital signals) capable of being processed by a processor of the computing device, such as microphones. In certain implementations, the receivers 110, 112 may be implemented as a microphone built into the computing device 102, 104. For example, one or both of the computing devices may be a smartphone, tablet computer, and/or laptop with a built-in microphone that performs the functions of the receivers 110, 112. In other implementations, the receivers 110, 112 may be implemented as a microphone external to the computing device 102, 104. For example, the receivers 110, 112 may be implemented as one or more microphones external to the computing device 102, 104 that are communicatively coupled to the computing device 102, 104. In certain implementations, the transmitter 106, 108 and receiver 110, 112 may be implemented as a single device connected to the computing device. For example, the transmitter 106, 108 and receiver 110, 112 may be implemented as a single device containing at least one speaker and at least one microphone that is communicatively coupled to the computing device 102, 104.

In certain implementations, one or both of the computing devices 102, 104 may include multiple transmitters 106, 108 and/or multiple receivers 110, 112. For example, the computing device 104 may include multiple transmitters 108 and multiple receivers 112 arranged in multiple locations so that the computing device 104 can communicate with the computing device 102 in multiple locations (e.g., when the computing device 102 is located near at least one of the multiple transmitters 108 and multiple receivers 112). In additional or alternative implementations, one or both of the computing devices 102, 104 may include multiple transmitters 106, 108 and/or multiple receivers 110, 112 in a single location. For example, the computing device 104 may include multiple transmitters 108 and multiple receivers 112 located at a single location. The multiple transmitters 108 and multiple receivers 112 may be arranged to improve coverage and/or signal quality in an area near the single location. For example, the multiple transmitters 108 and multiple receivers 112 may be arranged in an array or other configuration so that other computing devices 102 receive audio transmissions 114, 116 of similar quality regardless of their location relative to the transmitters 108 and receivers 112 (e.g., regardless of the location of the computing devices 102 within a service area of the transmitters 108 and receivers 112).

The computing devices 102, 104 may generate audio transmissions 114, 116 to transmit data 122, 124 to one another. For example, the computing device 102 may generate one or more audio transmissions 114 to transmit data 122 from the computing device 102 to the computing device 104. As another example, the computing device 104 may generate one or more audio transmissions 116 to transmit data 124 from the computing device 104 to the computing device 102. In particular, the computing devices 102, 104 may create one or more packets 118, 120 based on the data 122, 124 (e.g., including a portion of the data 122, 124) for transmission using the audio transmissions 114, 116. To generate the audio transmission 114, 116, the computing devices 102, 104 may modulate the packets 118, 120 onto an audio carrier signal. The computing devices 102, 104 may then transmit the audio transmission 114, 116 via the transmitter 106, 108, which may then be received by the receiver 110, 112 of the other computing device 102, 104. In certain instances (e.g., where the data 122, 124 exceeds a predetermined threshold for the size of a packet 118, 120), the data 122, 124 may be divided into multiple packets 118, 120 for transmission using separate audio transmissions 114, 116.
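By way of example only, dividing data that exceeds a packet-size threshold into multiple packets may be sketched as follows. The function name `split_into_packets` and the threshold of 4 bytes are hypothetical and chosen purely for illustration; the disclosure does not specify a packet size or format.

```python
from typing import List


def split_into_packets(data: bytes, max_packet_size: int) -> List[bytes]:
    # Divide the data into consecutive chunks no larger than the size threshold.
    if max_packet_size <= 0:
        raise ValueError("max_packet_size must be positive")
    return [data[i:i + max_packet_size] for i in range(0, len(data), max_packet_size)]


# Each resulting packet would be sent in its own audio transmission.
packets = split_into_packets(b"abcdefghij", max_packet_size=4)
print(packets)
```

Joining the packets in order reproduces the original data, so a receiving device may reassemble the data once all of the separate audio transmissions have been processed.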

Accordingly, by generating and transmitting audio transmissions 114, 116 in this way, the computing devices 102, 104 may be able to transmit data 122, 124 to one another without having to communicatively pair the computing devices 102, 104. Rather, a computing device 102, 104 can listen for audio transmissions 114, 116 received via the receivers 110, 112 from another computing device 102, 104 without having to communicatively pair with the other computing device 102, 104. Also, because these techniques can utilize conventional computer hardware like speakers and microphones, the computing devices 102, 104 do not require specialized hardware to transmit the data 122, 124.

However, transmitting data by audio transmissions has its own limitations. In particular, audio transmissions are susceptible to types of interference and/or distortions that are either not present or less prevalent for data transmissions by electromagnetic signals. For instance, different frequencies utilized by the audio transmission may attenuate differently, causing certain frequencies to appear larger in magnitude when received by another computing device. Further, over longer distances, the magnitude of the audio transmission when received may decrease, reducing the signal-to-noise ratio for the received audio transmission. Also, audio transmissions may be subjected to delay (e.g., phase shifts and other delays) created by transmission through an audio channel (e.g., created by the transmitters or receivers, or introduced by the environment through which the audio transmission traveled).

Therefore, there exists a need to account for these types of interference with audio transmissions. One solution to this problem is to detect audio transmissions with multiple receivers and to combine the audio signals from the multiple receivers. Certain types of interference (e.g., environmental noise or disruptions) may tend to be uncorrelated between different receivers (e.g., received at different magnitudes, received at different times, and/or not received at certain receivers). By contrast, the contents of the audio transmission received by the multiple receivers may be constant and therefore correlated between the different receivers (e.g., received in the same sequence at slightly different times). Therefore, combining the audio signals from the multiple receivers may increase the relative magnitude of the correlated portions of the audio signals (e.g., the audio transmission) while decreasing the relative magnitude of the uncorrelated portions of the audio signals (e.g., sources of interference). Accordingly, the combined audio signal may have a higher signal-to-noise ratio than the individual audio signals received by the multiple receivers. However, the audio transmission may not be received by all receivers connected to a particular computing device. Therefore, before combining the audio signals, the receivers that received the audio transmission may need to be identified. Also, receivers with lower magnitudes of the audio transmission may be more likely to include noise or other inaccurate symbols (e.g., other audio transmissions), which may reduce the signal-to-noise ratio of the combined audio signals. Therefore, the receivers with the largest magnitude of the audio transmission may need to be identified prior to combining the audio signals.
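The effect of combining audio signals from multiple receivers may be illustrated with the following simplified sketch. It assumes, purely for illustration, that all eight simulated receivers received the same transmission, that their signals have already been time-aligned, that the interference is independent Gaussian noise at each receiver, and that combining is performed by sample-wise averaging. The names `combine` and `snr` are hypothetical.

```python
import math
import random


def combine(signals):
    # Sample-wise average of time-aligned signals, one list per receiver.
    return [sum(samples) / len(samples) for samples in zip(*signals)]


def snr(clean, observed):
    # Ratio of signal power to the power of the residual noise.
    signal_power = sum(c * c for c in clean)
    noise_power = sum((o - c) ** 2 for c, o in zip(clean, observed))
    return signal_power / noise_power


rng = random.Random(0)
clean = [math.sin(2 * math.pi * 0.05 * n) for n in range(2000)]

# Each receiver hears the same transmission plus its own uncorrelated noise.
received = [[c + rng.gauss(0.0, 0.5) for c in clean] for _ in range(8)]

single_snr = snr(clean, received[0])
combined_snr = snr(clean, combine(received))
print(single_snr, combined_snr)
```

Because the transmission is identical across the simulated receivers while the noise is not, the averaged signal retains the transmission and attenuates the noise. A receiver that did not receive the transmission would contribute only noise to the average, which is one reason the receivers may need to be identified before combining.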

FIG. 2 illustrates an audio transmission 200 according to an exemplary embodiment of the present disclosure. The audio transmission 200 may be used to transmit data from one computing device to another computing device. For example, referring to FIG. 1, the audio transmission 200 may be an example implementation of the audio transmissions 114, 116 generated by the computing devices 102, 104. The audio transmission 200 includes multiple symbols 1-24, which may correspond to discrete time periods within the audio transmission 200. For example, each symbol 1-24 may correspond to 2 milliseconds (ms) of the audio transmission 200. In other examples, the symbols 1-24 may correspond to other time periods within the audio transmission 200 (e.g., 1 ms, 10 ms, 20 ms, 40 ms). Each symbol 1-24 may include one or more frequencies used to encode information within the audio transmission 200. For example, the one or more frequencies may be modulated in order to encode information in the audio transmission 200 (e.g., certain frequencies may correspond to certain pieces of information). In another example, the phases of the frequencies may additionally or alternatively be modulated in order to encode information in the audio transmission 200 (e.g., certain phase differences from a reference signal may correspond to certain pieces of information).

In particular, certain symbols 1-24 may correspond to particular types of information within the audio transmission 200. For example, the symbols 1-6 may correspond to a preamble 202 and symbols 7-24 may correspond to a payload 204. The preamble 202 may contain predetermined frequencies produced at predetermined points of time (e.g., according to a frequency pattern). In certain implementations, the preamble 202 may additionally or alternatively contain frequencies (e.g., a particular predetermined frequency) whose phase differences are altered by predetermined amounts at predetermined points of time (e.g., according to a phase difference pattern). The preamble 202 may be used to identify the audio transmission 200 to a computing device receiving the audio transmission 200. For example, a receiver of the computing device receiving audio transmissions such as the audio transmission 200 may also receive other types of audio data (e.g., audio data from environmental noises and/or audio interference). The preamble 202 may therefore be configured to identify audio data corresponding to the audio transmission 200 when received by the receiver of the computing device. In particular, the computing device may be configured to analyze incoming audio data from the receiver and to disregard audio data that does not include the preamble 202. Upon detecting the preamble 202, the computing device may begin receiving and processing the audio transmission 200. The preamble 202 may also be used to align processing of the audio transmission 200 with the symbols 1-24 of the audio transmission 200. In particular, by indicating the beginning of the audio transmission 200, the preamble 202 may enable the computing device receiving the audio transmission 200 to properly align its processing of the audio transmission with the symbols 1-24.

The payload 204 may include the data intended for transmission, along with other information enabling proper processing of the data intended for transmission. In particular, the payload 204 may include a header 206 and a packet 208. The packet 208 may contain data desired for transmission by the computing device generating the audio transmission 200. For example, and referring to FIG. 1, the packet 208 may correspond to the packets 118, 120, which may contain all or part of the data 122, 124. The header 206 may include additional information for relevant processing of data contained within the packet 208. For example, the header 206 may include routing information for a final destination of the data (e.g., a server external to the computing device receiving the audio transmission 200). The header 206 may also indicate an originating source of the data (e.g., an identifier of the computing device transmitting the audio transmission 200 and/or a user associated with the computing device transmitting the audio transmission 200).

The preamble 202 and the payload 204 may be modulated to form the audio transmission 200 using similar encoding strategies (e.g., similar encoding frequencies and/or phase differences). Accordingly, the preamble 202 and the payload 204 may be susceptible to similar types of interference (e.g., similar types of frequency-dependent attenuation and/or similar types of frequency-dependent delays). Proper extraction of the payload 204 from the audio transmission 200 may rely on proper demodulation of the payload 204 from an audio carrier signal. Therefore, to accurately receive the payload 204, the computing device receiving the audio transmission 200 must account for the interference.

Symbols 1-24 and their configuration depicted in FIG. 2 are merely exemplary. It should be understood that certain implementations of the audio transmission 200 may use more or fewer symbols, and that one or more of the preamble 202, the payload 204, the header 206, and/or the packet 208 may use more or fewer symbols than those depicted and may be arranged in a different order or configuration within the audio transmission 200.
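To illustrate how an arrival time may be used to align processing with the symbols, the following sketch maps the exemplary layout of FIG. 2 (symbols 1-6 for the preamble 202, symbols 7-24 for the payload 204, 2 ms per symbol) onto sample indices. The 48,000 Hz sampling rate and the names `symbol_span`, `PREAMBLE_SYMBOLS`, and `PAYLOAD_SYMBOLS` are assumptions made only for this example.

```python
SAMPLE_RATE_HZ = 48_000  # assumed sampling rate, not specified above
SYMBOL_MS = 2            # symbol duration from the example above

PREAMBLE_SYMBOLS = range(1, 7)   # symbols 1-6
PAYLOAD_SYMBOLS = range(7, 25)   # symbols 7-24


def symbol_span(symbols, start_sample=0):
    # Half-open sample range [first, last) covered by a run of consecutive symbols,
    # where start_sample is the sample index at which symbol 1 begins.
    samples_per_symbol = SAMPLE_RATE_HZ * SYMBOL_MS // 1000
    first = start_sample + (symbols[0] - 1) * samples_per_symbol
    last = start_sample + symbols[-1] * samples_per_symbol
    return first, last


print(symbol_span(PREAMBLE_SYMBOLS))
print(symbol_span(PAYLOAD_SYMBOLS))
```

Passing a detected arrival time (as a sample index) for `start_sample` shifts both ranges accordingly, so an error of a few samples in the arrival time shifts every symbol boundary by the same amount.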

FIGS. 3A-3B illustrate a transmitter/receiver array 300 according to an exemplary embodiment of the present disclosure. The transmitter/receiver array 300 may be used to transmit and/or receive audio transmissions 200. For example, the transmitter/receiver array 300 may be an exemplary implementation of at least one of the computing devices 102, 104. The transmitter/receiver array 300 includes eight receivers 302A-H and eight transmitters 304A-H. Each of the eight receivers 302A-H may be an exemplary implementation of the receivers 110, 112. For example, the eight receivers 302A-H may be implemented as microphones. Each of the eight transmitters 304A-H may be an exemplary implementation of the transmitters 106, 108. For example, the eight transmitters 304A-H may be implemented as speakers.

As depicted, the receivers 302A-H and the transmitters 304A-H are arranged to evenly cover a 360° area surrounding the transmitter/receiver array 300. For example, the receivers 302A-H and transmitters 304A-H are arranged so that there is approximately 45° between adjacent receivers 302A-H and adjacent transmitters 304A-H. Such a configuration may enable the transmitter/receiver array 300 to receive audio transmissions 200 from, and transmit audio transmissions 200 in, multiple directions within a coverage area of the transmitter/receiver array 300. The transmitter/receiver array 300 may be configured to receive audio transmissions from, and transmit audio transmissions to, computing devices located within the coverage area of the transmitter/receiver array 300. For example, FIG. 4 illustrates a scenario 400 in which a computing device 402 (e.g., a mobile device) transmits audio transmissions 404 to the transmitter/receiver array 300 and receives audio transmissions 406 from the transmitter/receiver array 300.

Returning to FIGS. 3A-3B, the receivers 302A-H and the transmitters 304A-H may be mounted on a support body 306. The support body 306 may allow the transmitter/receiver array 300 to be positioned and configured without altering the relative orientation of the receivers 302A-H and the transmitters 304A-H. In certain implementations, the receivers 302A-H may be mounted such that the receivers 302A-H are separated from the transmitters 304A-H (e.g., so that the receivers 302A-H can avoid interference from the transmitters 304A-H). For example, the receivers 302A-H may be mounted on structural members 308A-D (only a subset of which are depicted in FIG. 3B) that separate the receivers 302A-H from the transmitters 304A-H. In certain implementations, the transmitter/receiver array 300 may be mounted on a support element, such as the support element 310. The support element 310 may raise the transmitter/receiver array 300 from the ground such that the transmitter/receiver array 300 is at a height better suited to receiving and transmitting audio transmissions 200 (e.g., at or between chest and waist height for a typical individual).

It should be appreciated that additional or alternative implementations of the transmitter/receiver array 300 are possible. For example, alternative implementations may have more or fewer transmitters and/or receivers and/or may have larger or smaller transmitters and/or receivers. As another example, alternative implementations may omit one or more of the support body 306, the structural members 308A-D, and/or the support element 310. As yet another example, alternative implementations may further include a housing surrounding the transmitters 304A-H and/or receivers 302A-H.

FIG. 5A illustrates a predetermined portion 500 of an audio transmission according to an exemplary embodiment of the present disclosure. The predetermined portion 500 may represent a portion of an audio transmission that is generated to include an expected sequence of symbols. In certain implementations, the predetermined portion 500 may be included at the beginning of an audio transmission. For example, the predetermined portion 500 may be an exemplary implementation of the preamble 202 of the audio transmission 200. In additional or alternative implementations, the predetermined portion 500 may be included later within an audio transmission. For example, the predetermined portion 500 may occur in the middle of an audio transmission or at the end of an audio transmission. In still further implementations, the predetermined portion 500 may occur multiple times within an audio transmission. For example, the predetermined portion 500 may be included at the beginning of, in the middle of, and/or at the end of an audio transmission. In additional or alternative implementations, the predetermined portion 500 may be an exemplary implementation of an expected sequence used to generate a preamble or other predetermined portion of an audio transmission.

The predetermined portion 500 includes a real portion 502 and an imaginary portion 504. The real portion 502 may include the real values of the predetermined portion 500 and the imaginary portion 504 may include the imaginary values of the predetermined portion 500. In certain implementations, the real values included within the real portion 502 may represent a magnitude for samples of an audio signal containing the predetermined portion 500 (e.g., for one or more audio samples included within the predetermined portion 500). Additionally or alternatively, the imaginary values included within the imaginary portion 504 may represent an amount of phase shift for samples of an audio signal containing the predetermined portion 500. For example, the imaginary values may represent a phase delay at passband (e.g., when transmitted) after the predetermined portion 500 is modulated onto a carrier frequency of the audio transmission. Accordingly, the real portion 502 and the imaginary portion 504 may, in combination, form complex values.

As depicted, the predetermined portion 500 is a continuous signal with continuously changing real values and imaginary values. In alternative implementations, the predetermined portion may include symbols containing the predetermined real values and imaginary values. For example, the predetermined portion 500 may include 2, 6, 8, or 16 symbols. In such implementations, each of the symbols may contain one or more complex values.

Additionally, as depicted, the predetermined portion 500 includes both a real portion 502 and an imaginary portion 504. For example, where the predetermined portion 500 is part of an audio transmission that is to be transmitted, the predetermined portion 500 may be generated to include both the real portion 502 and the imaginary portion 504. However, in certain implementations, the predetermined portion 500 may not include the imaginary portion 504. For example, audio transmissions may be transmitted as real-valued signals. In particular, the predetermined portion may be transmitted by taking the real portion 502 of the predetermined portion 500 (e.g., with the imaginary portion 504 acting as a delay). Accordingly, the predetermined portion 500 of a received audio transmission may lack the imaginary portion 504. In such instances, the real portion 502 of the predetermined portion 500 may be utilized, as explained further below.

FIG. 5B illustrates plots 510, 520 of a predetermined portion of an audio transmission according to an exemplary embodiment of the present disclosure. The plot 510 depicts the amplitude of the real and imaginary portions of the predetermined portion over time and the plot 520 depicts the amplitude and phase of the predetermined portion over time. As seen in the legend 512, the plot 510 includes a line 514 representing the amplitude of the real portion of the predetermined portion and a line 516 representing the amplitude of the imaginary portion of the predetermined portion. As seen with the lines 514, 516, the real and imaginary portions are out of phase, with the imaginary portion peaking after the real portion. Furthermore, over time, the frequency of both the real portion and the imaginary portion increases, causing the lines 514, 516 to oscillate closer together at the end of the plot 510. Turning to the plot 520, as seen in the legend 522, the plot 520 includes a line 524 representing the amplitude of the predetermined portion over time and a line 526 representing the phase of the predetermined portion over time. As the line 524 demonstrates, the amplitude of the predetermined portion 500 remains at 1 throughout the plot 520, which is why both the real and imaginary portions oscillate between +1 and -1 within the plot 510. In the line 526, the phase of the predetermined portion increases from a minimum of -π radians to a maximum of +π radians between maximum and minimum peaks of the imaginary portion of the predetermined portion before returning to -π radians. Accordingly, the phase change cycle depicted in line 526 repeats twice for every cycle of the imaginary portion depicted in line 516. Therefore, as the frequency of the predetermined portion increases over time, the line 526 within the plot 520 compresses closer together than the lines 514, 516.

In the above examples, the predetermined portion is depicted and discussed as a continuous linear phase chirp signal. Specifically, the predetermined portion was generated as a signal s(n), where:

$s(n) = e^{j2\pi\phi(n)},\text{ with}$

$\phi(n) = \sum\limits_{i = 0}^{n}\frac{f(i)}{f_{start}},\text{ and}$

$f(n) = f_{start} + n \ast \left( \frac{f_{end} - f_{start}}{N_{samples}} \right)$

where:

-   ϕ(n) is the phase for sample n;
-   f(n) is the frequency for sample n;
-   f_(start) is the starting frequency of the predetermined portion;
-   f_(end) is the ending frequency of the predetermined portion; and
-   N_(samples) is the total number of samples in the predetermined portion.
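As a non-limiting illustration, the signal s(n) defined above might be generated as in the following sketch. The function name `chirp_preamble` and the parameter values (12 kHz, 15 kHz, 1024 samples) are hypothetical and chosen only for illustration, and the NumPy library is assumed to be available.

```python
import numpy as np

def chirp_preamble(f_start, f_end, n_samples):
    """Return s(n) = exp(j*2*pi*phi(n)) for n = 0 .. n_samples - 1."""
    n = np.arange(n_samples)
    # f(n): frequency ramps linearly from f_start toward f_end.
    f = f_start + n * (f_end - f_start) / n_samples
    # phi(n): running sum of f(i) / f_start for i = 0 .. n (inclusive).
    phi = np.cumsum(f / f_start)
    return np.exp(1j * 2 * np.pi * phi)

# Hypothetical parameters, chosen only for illustration.
s = chirp_preamble(f_start=12_000.0, f_end=15_000.0, n_samples=1024)
real_portion, imaginary_portion = s.real, s.imag
```

Because every sample is a complex exponential, the magnitude of s(n) is 1 for all n, which is consistent with the constant amplitude shown by the line 524.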

It should be understood that additional or alternative implementations of the predetermined portion may be possible. For example, the predetermined portion may include an increasing frequency over time (as depicted), a decreasing frequency over time, or both. Similarly, the predetermined portion may include phase change cycles that increase (as depicted), decrease, or both. In implementations where the predetermined portion is implemented as a series of symbols, each symbol may have a different type of signal with different frequency and/or phase change cycle characteristics. As a further example, the predetermined portion may be implemented as a complex single tone and/or as a phase-shift keying (PSK) signal.

FIGS. 6A-6C illustrate comparisons 600, 620, 630 of a received audio signal to an expected sequence 602. The received audio signal includes a predetermined portion 604, which may be an exemplary implementation of any of the predetermined portions discussed in the present disclosure. The received audio signal also includes earlier audio data 606 and later audio data 608. The earlier audio data 606 may include audio signals received by a receiver before the predetermined portion 604. For example, where the predetermined portion 604 indicates the beginning of an audio transmission, the earlier audio data 606 may include environmental sounds and/or interference unrelated to the predetermined portion 604 and unrelated to an audio transmission included within the received audio signal. Furthermore, received audio signals may contain only real values and may not contain imaginary values.

In the comparisons 600, 620, 630, the received audio signal is compared to an expected sequence 602. The expected sequence 602 may contain an imaginary portion and a real portion, similar to the predetermined portion 500 discussed above. In particular, the expected sequence 602 may contain a series of continuous complex values with which the predetermined portion 604 was initially transmitted (e.g., by another computing device to the computing device receiving the audio signal). The received audio signal may be compared to the expected sequence 602 by computing a similarity measure 610, 622, 632 between the expected sequence 602 and a corresponding portion of the received audio signal. In the comparison 600, the similarity measure 610 is computed between the expected sequence 602 and the predetermined portion 604. In the comparison 620, the similarity measure 622 is calculated between the expected sequence 602 and a portion of the earlier audio data 606 and the predetermined portion 604. In the comparison 630, the similarity measure 632 is calculated between the expected sequence 602 and a portion of the later audio data 608 and the predetermined portion 604.

The similarity measures 610, 622, 632 may be computed to measure the similarity between the complex values of the expected sequence 602 and the real values from the corresponding portion of the received audio signal. For example, the similarity measures 610, 622, 632 may be calculated as a correlation between the expected sequence 602 and the corresponding portion of the received audio signal. In particular, with reference to the signal s(n) of the predetermined portion 500 discussed above, the received audio signal y(n) may be received as:

$y(n) = \text{Real}\left( x(n) \ast e^{-j\theta_{channel}} \right),\text{ with}$

$x(n) = \text{Real}\left( s(n) \right)$

where:

-   θ_(channel) is a phase delay caused by the audio channel through which the audio transmission was sent and received (e.g., the physical environment, the transmitter, and the receiver).

In such instances, the similarity measures, y_(corr)(n), may be calculated as:

$y_{corr}(n) = \text{conv}\left( y(n), s(n) \right)$

where:

-   conv() is a measure of the convolution between two signals.
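One possible way to compute y_(corr)(n) is sketched below. The sketch assumes that conv() is realized as a sliding correlation, which `numpy.correlate` provides (it conjugates its second argument). The function name `similarity`, the chirp parameters, and the starting sample 300 are hypothetical values chosen for illustration.

```python
import numpy as np

def similarity(y, s):
    """Correlation magnitude between real signal y and complex sequence s,
    one value per window start."""
    # numpy.correlate conjugates its second argument, so entry k equals
    # sum_n y[n + k] * conj(s[n]) for the window of y starting at sample k.
    return np.abs(np.correlate(y, s, mode="valid"))

# Hypothetical expected sequence: a unit-amplitude complex chirp.
n = np.arange(1024)
s = np.exp(1j * np.pi * (0.25 / 1024) * n * (n + 1))

# Received signal: only the real portion is present, starting at sample 300.
y = np.zeros(2048)
y[300:300 + s.size] = np.real(s)

y_corr = similarity(y, s)
best_start = int(np.argmax(y_corr))  # window start with the largest measure
```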

Comparisons may be performed at multiple times to detect when an audio transmission is received. For example, comparisons may be performed on a continuous basis (e.g., at regular intervals) for audio signals received by a receiver, such as the receivers 302A-H. In particular, similarity measures 610, 622, 632 may be calculated between the expected sequence 602 and the most recently received portion of the received audio signal that is the same length as the expected sequence 602 (e.g., a sliding window of the audio signal with the same length as the expected sequence 602). In practice, as depicted in the comparisons 620, 630, comparisons may be performed for audio signals received before and/or after the predetermined portion 604. Accordingly, the similarity measures 622, 632 may be lower than the similarity measure 610, indicating that the predetermined portion 604 is contained within the corresponding part of the received audio signal for the comparison 600.

As a further example, FIG. 6D depicts a plot 640 of similarity measures for different windows of a received audio signal. In particular, as indicated by the legend 642, the plot 640 includes a line 644 corresponding to a similarity measure using only real values of the expected sequence 602 and a line 646 corresponding to a similarity measure using complex values of the expected sequence 602. The lines 644, 646 depict similarity measures for sliding windows of a received audio signal starting at different times. As can be seen, the line 644 includes multiple peaks, with a maximum value at point 648. The line 646 follows a similar shape as the line 644, but with many more intervening peaks between the local maxima of the line 644, caused by the changes in phase reflected in and/or detected using the complex values.

In practice, the intervening peaks of similarity measures utilizing complex values enable greater time resolution, allowing for substantially improved determination of when an audio transmission is received and detected. However, computing similarity measures 610, 622, 632 using both real and imaginary values of the expected sequence 602 may require substantially more computing resources than computing similarity measures 610, 622, 632 using only real values of the expected sequence. Specifically, computing correlations between a received audio signal and both the real and imaginary portions of the expected sequence 602 may be more than twice as computationally expensive as computing correlations between the received audio signal and only the real portion of the expected sequence 602. Furthermore, to compute similarity measures for the real and imaginary portions of the expected sequence, both magnitude and phase of the received audio signal must be extracted, rather than just extracting magnitude when using only the real portion. Therefore, it may be impractical to compare the imaginary portion of the expected sequence to all incoming audio signals. In particular, solely performing comparisons of the imaginary portions may utilize too much processing power to enable a computing device to accurately receive and process audio signals from multiple receivers, such as the receivers 302A-H. As another example, where the computing device receiving the audio signal is a mobile computing device, solely performing comparisons of the imaginary portions may unduly impact battery life or responsiveness of the mobile computing device.

Therefore, the similarity measures calculated on a rolling basis may be initially calculated using real values (e.g., the similarity measures indicated by the line 644) of the expected sequence 602 to reduce overall computing resource utilization. In practice, however, because of the reduced time resolution of the line 644 and the real-valued similarity measures, an incorrect value may be identified as the maximum similarity value, resulting in an incorrect estimation of the arrival time for the audio transmission. For example, similarity measures may be calculated at regular time intervals (e.g., every 1 ms, 10 ms, 20 ms, or 50 ms) for incoming audio signals. In one specific example, two consecutive similarity measures may be represented by the points 654, 656 on the line 644. Because the point 654 has a larger similarity measure than the point 656, the point 654 may be used to identify the time of detection for the predetermined portion 604. However, the point 654 has a smaller similarity measure than the point 648, and using the point 654 may result in an incorrect time of detection for the predetermined portion 604, resulting in erroneous future processing of the audio transmission containing the predetermined portion 604. Such scenarios may be more likely when phase shifts (e.g., delays that are smaller than the duration of a sample of the audio signals) are introduced to the audio signals. In such instances, the plot 640 may be adjusted such that peaks adjacent to the maximum peak at point 648 of the line 644 may be higher in magnitude, resulting in a further misidentification of the correct time of detection.

In certain implementations, after detecting a similarity measure that exceeds a predetermined threshold, such as at the point 654, more detailed processing may take place. For example, similarity measures may be computed for multiple samples before and after the point 654. Performing such analysis using similarity measures calculated based on real values of the expected sequence alone may be insufficient to most accurately identify when an audio transmission is detected. For example, similarity measures computed based on real values may be unable to accurately identify the start time of the predetermined portion 500 with accuracies of, e.g., less than 0.2 ms, 0.1 ms, 0.05 ms, or 0.01 ms. As a specific example, the points 650, 652 may correspond to similarity values for sliding windows beginning on consecutive samples of the received audio signal. Because the similarity measure at point 650 is larger than the similarity measure at point 652, the point 650 may be identified as the point with the highest similarity measure. However, the point 650 still has a lower similarity measure than the point 648, resulting in a less accurate detection time for the predetermined portion 604. For example, the points 648, 650 may be more than 0.05 ms apart. In certain implementations, e.g., for high-bandwidth audio transmissions, for audio transmissions received over long distances, for audio transmissions received in a noisy environment, and/or for distinguishing between multiple received audio transmissions, accuracies of 0.05 ms or less may be necessary to ensure accurate subsequent processing of received audio transmissions.

Therefore, similarity measures using complex values (e.g., real values and imaginary values) of the expected sequence 602 may also be analyzed. For example, similarity measures may be calculated for windows beginning or ending with samples before and after the point 654 (e.g., for a range of windows beginning or ending with consecutive samples before and after the point 654). In particular, similarity measures may be calculated within a predetermined threshold time difference (e.g., a predetermined amount of time and/or a predetermined number of samples before and/or after the point 654). For example, similarity measures may be calculated for windows beginning with samples that occur within 15 samples before or after the point 654. As another example, similarity measures may be calculated for windows beginning within 1 ms (or 0.5 ms) before or after the point 654. The similarity measures may be calculated between complex values of the expected sequence and the real values of the received audio signal using the convolution formulation discussed above. In practice, because of the faster changes to the phase of the audio signal (e.g., as indicated by the line 526) and corresponding improved time resolution in the line 646, utilizing complex values may enable more precise detection of where the largest similarity measure occurs. Specifically, by utilizing information regarding the phase, similarity measures calculated based on complex values may allow for sub-sample resolution (e.g., the detection of maximum similarity measures that occur between samples of the audio signal), which can help detect an accurate arrival time even when sub-sample phase shifts are introduced to the audio signal. Accordingly, in the above-discussed scenario where the points 650, 652 represent consecutive samples, similarity measures calculated based on the complex values of the expected sequence 602 may result in calculating a similarity measure at or near the point 648, even though the point 648 occurs between two samples. By detecting this sub-sample occurrence of a similarity measure, the accuracy of detection for when the predetermined portion 604 was received may be improved. Such implementations may therefore enable the 0.05 ms or less accuracies that may be required to accurately process audio transmissions in certain instances.
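The two-stage search described above might be sketched as follows. This is a simplified illustration under stated assumptions rather than a definitive implementation: the names `window_score`, `detect_start`, `stride`, and `radius` are hypothetical, the coarse stage simply selects the largest real-valued measure instead of applying a threshold, and only whole-sample window starts are evaluated, so sub-sample resolution is not shown.

```python
import numpy as np

def window_score(y, ref, start):
    """Correlation magnitude between ref and the window of y beginning at start."""
    window = y[start:start + ref.size]
    return abs(np.sum(np.conj(ref) * window))

def detect_start(y, s, stride=4, radius=15):
    """Coarse search with the real portion of s, then refinement with complex s."""
    last_start = y.size - s.size
    # Stage 1: real portion only, at every stride-th window start.
    coarse = max(range(0, last_start + 1, stride),
                 key=lambda k: window_score(y, s.real, k))
    # Stage 2: real and imaginary portions, at every start within radius samples.
    lo, hi = max(0, coarse - radius), min(last_start, coarse + radius)
    fine = max(range(lo, hi + 1), key=lambda k: window_score(y, s, k))
    return coarse, fine

# Hypothetical unit-amplitude complex chirp used as the expected sequence.
n = np.arange(1024)
s = np.exp(1j * np.pi * (0.25 / 1024) * n * (n + 1))

# Real-valued received signal whose predetermined portion begins at sample 301.
y = np.zeros(2048)
y[301:301 + s.size] = s.real

coarse, fine = detect_start(y, s)
```

In this sketch the complex-valued measures are computed for at most 2 * radius + 1 windows, while the cheaper real-valued measures cover the rest of the signal.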

The above-discussed techniques combine the processing benefits of the real portion comparisons with the accuracy benefits of the imaginary portion comparisons. Specifically, by relying primarily on the comparisons of the real portions of received audio signals to initially detect the predetermined portion of an audio transmission and performing further comparisons of the imaginary portions to more accurately identify precisely when the audio transmission is received, the above-discussed techniques reduce the overall computing resources required as compared to utilizing only similarity measures calculated based on complex values of the expected sequence while also enabling the improved precision that the similarity measures provide once a predetermined portion 604 has been detected. Therefore, these techniques may enable improved precision of audio transmission detection while also imposing negligible impacts on overall device processing power and battery life.

In FIGS. 5A-5B and 6A-6D, the examples discussed utilized expected sequences and predetermined portions that were continuous signals. However, in certain instances, the expected sequences and predetermined portions may be implemented using discrete symbols (e.g., 2, 6, 8, or 16 symbols) within the audio transmission. In certain implementations, expected sequences and predetermined portions implemented using discrete symbols may be shorter in duration (e.g., 2 symbols/4 ms, 5 symbols/10 ms) than continuous signals (e.g., 40 ms). In implementations using discrete symbols, similarity measures may be calculated for each symbol individually (e.g., each symbol within the expected sequence) and may be combined to form a single similarity measure overall.

For example, FIGS. 7A-7C illustrate comparisons 700, 720, 730 of complex values 702, 704, 706 of an expected sequence 701 to a received audio signal 719 according to an exemplary embodiment of the present disclosure. The expected sequence 701 includes symbols 1-3, which contain the complex values 702, 704, 706. The complex values 702, 704, 706 may include one or more real- and imaginary-valued audio signals, similar to the real portion 502 and the imaginary portion 504 of the predetermined portion 500. The received audio signal 719 includes symbols 1-3 containing real values 708, 710, 712. For example, the symbols 1-3 may correspond to the symbols 1-3 of the expected sequence 701. The real values 708, 710, 712 contained within the symbols 1-3 may contain one or more real-valued audio signals. Similar to the received audio signal in the comparisons 600, 620, 630, the received audio signal 719 includes earlier audio data 738 and later audio data 728, which may include, e.g., environmental audio data, audio data corresponding to additional symbols within a predetermined portion of the received audio signal 719, and/or audio data corresponding to earlier or later portions of the audio transmission.

Turning to FIG. 7A, in the comparison 700, similarity measures 714, 716, 718 are calculated between the expected sequence 701 and the audio signal 719. In particular, the comparison 700 is performed at a time when the symbols 1-3 of the audio signal 719 align with the symbols 1-3 of the expected sequence 701. Therefore, the similarity measure 714 is calculated between the complex value 702 of symbol 1 of the expected sequence 701 and the real value 708 of symbol 1 of the audio signal 719. Similarly, the similarity measure 716 is calculated between the complex value 704 of symbol 2 of the expected sequence 701 and the real value 710 of symbol 2 of the audio signal 719. Further, the similarity measure 718 is calculated between the complex value 706 of symbol 3 of the expected sequence 701 and the real value 712 of symbol 3 of the audio signal 719. The similarity measures 714, 716, 718 may be calculated to determine how closely the complex values of the expected sequence 701 and the real values 708, 710, 712 of the audio signal 719 resemble one another. For example, the similarity measures 714, 716, 718 may be calculated as a correlation (e.g., a convolution measure) between the values that are compared, as discussed above. The similarity measures 714, 716, 718 may be combined into a combined similarity measure between the expected sequence 701 and the audio signal 719.

Comparisons similar to the comparison 700 may be performed at different starting times for the compared audio signal. For example, in the comparison 720, a later portion of the audio signal 719 is compared to the expected sequence 701. In particular, the later portion of the audio signal 719 includes the later audio data 728. Because the later portion is compared, the similarity measures 722, 724, 726 computed during the comparison may not correctly align between the symbols 1-3 of the expected sequence 701 and the audio signal 719. In particular, the similarity measure 722 is computed between the complex value 702 of the expected sequence 701 and portions of the real values 708, 710; the similarity measure 724 is computed between the complex value 704 and portions of the real values 710, 712; and the similarity measure 726 is computed between the complex value 706, the real value 712, and the later audio data 728. The similarity measures 722, 724, 726 may be computed similarly to the similarity measures 714, 716, 718 and may similarly be combined to generate a combined similarity measure for the comparison 720.

Based on the improper alignment of the symbols, the similarity measures 722, 724, 726 may be lower than the similarity measures 714, 716, 718 because non-corresponding symbols are being compared and may therefore differ more. In particular, the real values 708, 710, 712 may differ from the complex values 702, 704, 706 of the expected sequence 701 after being transmitted (e.g., due to interference or other distortions). However, the differences in the comparison 700 may be smaller than the differences between the misaligned symbols 1-3 of the audio signal 719 because portions of real values 710, 712 that occur later and later audio data 728 that do not correspond to the correct symbols 1-3 of the expected sequence 701 are being compared to the complex values 702, 704, 706. Therefore, because the audio signal 719 is improperly aligned with the expected sequence 701 in the comparison 720, the combined similarity measure for the comparison 720 may be lower than the combined similarity measure for the comparison 700.

As another example, in the comparison 730, the expected sequence 701 is compared to an earlier portion of the audio signal 719 that includes the earlier audio data 738 that occurs before the symbols 1-3. Similar to the later audio data 728, the earlier audio data 738 may cause the symbols 1-3 of the audio signal 719 to improperly align with the symbols 1-3 of the expected sequence 701 when calculating the corresponding similarity measures 732, 734, 736. In particular, the similarity measure 732 is calculated between the complex value 702, the earlier audio data 738 and a portion of the real value 708; the similarity measure 734 is calculated between the complex value 704 and portions of the real values 708, 710; and the similarity measure 736 is calculated between the complex value 706 and portions of the real values 710, 712. The similarity measures 732, 734, 736 may be calculated using techniques comparable to those used to calculate the similarity measures 714, 716, 718 and may be combined to generate a combined similarity measure for the comparison 730. As explained above, because the symbols 1-3 of the audio signal 719 do not align with the symbols 1-3 of the expected sequence 701, the similarity measures 732, 734, 736 may be lower than the similarity measures 714, 716, 718 and therefore the combined similarity measure for the comparison 730 may be lower than the combined similarity measure for the comparison 700.
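The per-symbol comparisons 700, 720, 730 might be sketched as follows. The sketch assumes that the per-symbol similarity measures are combined by summing their magnitudes, which is only one possible way of combining them. The function name `combined_similarity`, the three 64-sample tone symbols, and the window starts are hypothetical.

```python
import numpy as np

def combined_similarity(window, symbols):
    """Return per-symbol correlation magnitudes and their sum for one window."""
    scores, offset = [], 0
    for sym in symbols:
        # Compare each expected symbol with the matching segment of the window.
        segment = window[offset:offset + sym.size]
        scores.append(abs(np.sum(np.conj(sym) * segment)))
        offset += sym.size
    return scores, float(np.sum(scores))

# Hypothetical expected sequence: three complex tone symbols of 64 samples each.
n = np.arange(64)
symbols = [np.exp(2j * np.pi * k * n / 64) for k in (4, 8, 12)]

# Received signal: real values of the symbols, with silence before and after.
y = np.concatenate([np.zeros(32), np.real(np.concatenate(symbols)), np.zeros(32)])

length = sum(sym.size for sym in symbols)
_, aligned_total = combined_similarity(y[32:32 + length], symbols)     # symbols line up
_, misaligned_total = combined_similarity(y[48:48 + length], symbols)  # 16 samples late
```

In this sketch the aligned window yields the larger combined measure, mirroring the relationship between the comparison 700 and the comparisons 720, 730.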

Although only three comparisons 700, 720, 730 are depicted and discussed above, in practice, additional or fewer comparisons may be performed. For example, as discussed above, the comparisons may be repeated for a predetermined period of time before and/or after the initial estimate of the starting time of the predetermined portion. As a specific example, the comparisons may be repeated at an initial estimate of the arrival time (e.g., based on a comparison of the real values of the expected sequence 701 with the real values 708, 710, 712 of the received audio signal 719), at ten different times before the initial estimate, and at ten different times after the initial estimate. In alternative implementations, the comparisons may be repeated five, 20, or 50 times before and/or after an initial estimate of the arrival time. The comparisons may be repeated at regular time intervals (e.g., every 0.02 ms, every 0.05 ms, every 0.1 ms) before and/or after the initial estimate. Additionally or alternatively, the comparisons may be repeated to begin with particular audio samples (e.g., each of a predetermined number of samples before and/or after the initial estimate, every second or third sample before and/or after the initial estimate). In this way, the comparisons 700, 720, 730 of the imaginary portions of the expected sequence and a received audio signal may be more granular and precise than comparisons of the real portions.
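As a brief illustration, the window starts at which the comparisons are repeated might be enumerated as in the sketch below. The function name `candidate_starts` and its parameters are hypothetical, and times are expressed as sample indices for simplicity.

```python
def candidate_starts(initial, count=10, step=1, last_start=None):
    """Window starts at the initial estimate plus `count` starts on each side."""
    starts = [initial + k * step for k in range(-count, count + 1)]
    # Discard starts that fall outside the available audio signal.
    return [s for s in starts
            if s >= 0 and (last_start is None or s <= last_start)]

# Ten starts before and ten after a hypothetical initial estimate at sample 100.
starts = candidate_starts(100, count=10, step=1)
# Every second sample around an estimate near the beginning of the signal.
sparse = candidate_starts(3, count=5, step=2)
```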

As explained above, comparing the imaginary portions may allow for more accurate similarity measures that are capable of determining differences in alignment of the symbols 1-3 for smaller time differences than comparing the real portions. Accordingly, the comparisons 700, 720, 730 may be performed after multiple comparisons using real values of the expected sequence 701 are performed to initially estimate an arrival time of an audio transmission. For example, after an initial estimate of the starting time of the predetermined portion is determined based on comparisons of the real values of an audio signal containing the audio transmission, comparisons 700, 720, 730 may be performed for a smaller range of times surrounding the initial estimate of the starting time, similar to the comparisons 600, 620, 630 discussed above.

FIG. 8 illustrates a computing device 800 according to an exemplary embodiment of the present disclosure. The computing device 800 may be an exemplary implementation of one of the computing devices 102, 104. In particular, the computing device 800 may be configured to receive and process audio signals to detect when an audio signal contains an audio transmission and to precisely determine when the audio transmission begins within the signal. For example, the computing device 800 may be configured to perform one or more of the comparisons 600, 620, 630, 700, 720, 730 discussed above.

The computing device 800 includes an audio signal 802, an expected sequence 814, similarity measures 820, 824, comparison times 822, 826, and a transmission detection time 828. The audio signal 802 may be received from a receiver, such as one of the receivers 302A-H. In certain implementations, the computing device 800 may receive multiple audio signals from multiple receivers and the audio signal 802 may be an exemplary implementation of one of the audio signals. The audio signal 802 may contain an audio transmission 804, and the computing device 800 may be configured to detect the audio transmission 804 within the audio signal 802. For example, the computing device 800 may continuously receive the audio signal 802 from a receiver, but the audio signal 802 may only occasionally include audio transmissions 804 received from other computing devices. The computing device 800 may therefore be configured to detect when the audio signal 802 includes an audio transmission 804 (e.g., in order to synchronize further processing of the audio transmission 804 with the correct portion of the audio signal 802).

The audio transmission 804 contains a predetermined portion 806 and data 812. The data 812 may be data intended for transmission with the audio transmission 804 and may be stored within a payload of the audio transmission 804. The predetermined portion 806 may be a portion of the audio transmission 804 that was generated to include a known or predetermined sequence of symbols, as discussed above. In particular, the predetermined portion 806 may include a real portion 810 containing real-valued signals.

The predetermined portion 806 may correspond to the expected sequence 814. In particular, the predetermined portion 806 may be initially generated and transmitted based on the same or similar symbols as the expected sequence 814. For example, the expected sequence 814 includes both a real portion 816 and an imaginary portion 818. The predetermined portion 806 may be generated, prior to transmission, to include symbols that have the same real portion 816 as the expected sequence 814 and to have phase delays based on the imaginary portion 818.

However, after the audio transmission 804 is transmitted from the other computing device and received by the computing device 800, the predetermined portion 806 may differ from the expected sequence 814 (e.g., due to channel delay, interference, and/or distortion). Therefore, the real portion 810 may differ from the real portion 816 and/or may have been phase shifted (e.g., by the audio channel), causing differences between the predetermined portion 806 and the imaginary portion 818. It may therefore be necessary to compare the predetermined portion 806 to the expected sequence 814 to determine when the predetermined portion 806 occurs within the audio transmission 804 and, by extension, to determine the transmission detection time 828 indicating when the audio transmission 804 begins within the audio signal 802.

The computing device 800 may initially compare the real portions 810, 816 of the predetermined portion and the expected sequence 814 to determine the transmission detection time 828. Specifically, the computing device 800 may compare the real portions 810, 816 to determine an initial estimate 821 of the transmission detection time 828. For example, the computing device 800 may calculate one or more similarity measures 820 between the real portions 810, 816 at one or more comparison times 822. As a specific example, the computing device 800 may calculate the similarity measures 820 as one or more correlation measures (e.g., convolutions) between the real portions 810, 816 at different comparison times 822. In particular, the computing device 800 may generate similarity measures 820 between the real portion 816 of the expected sequence 814 and a real portion (e.g., real values) of the audio signal 802 at regular intervals. The similarity measures 820 may include similarity measures for individual symbols within the real portion 816 and/or combined similarity measures that correspond to individual comparison times 822, as discussed above. In certain instances, the comparison times 822 may correspond to a particular time stamp for a window (e.g., a sliding window) of the audio signal 802 that is compared to the expected sequence 814. For example, the comparison times 822 may represent a starting timestamp for the window, an ending timestamp for the window, a middle timestamp for the window, or any other timestamp for the window.

The computing device 800 may initially detect the audio transmission 804 based on the comparisons of the real portions 810, 816. For example, where the computing device 800 compares the real portion 816 of the expected sequence 814 to the audio signal 802 at regular intervals, the computing device 800 may detect the audio transmission 804 when at least one of the similarity measures 820 exceeds a predetermined threshold. Upon detecting the audio transmission 804, the computing device 800 may determine an initial estimate 821 of the time of arrival for the starting time of the predetermined portion 806 within the audio signal 802. For example, the initial estimate 821 of the time of arrival may be determined for the first comparison time 822 at which a corresponding similarity measure 820 (e.g., a combined similarity measure) exceeds the predetermined threshold. In additional or alternative implementations, the computing device 800 may continue to compute similarity measures after (e.g., for a predetermined period of time after) a similarity measure 820 exceeds the predetermined threshold. The computing device 800 may, in such implementations, determine the initial estimate 821 to be the comparison time 822 corresponding to the largest similarity measure 820.
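The two ways of selecting the initial estimate 821 described above might be sketched as follows. The function name `initial_estimate`, the `lookahead` parameter (the number of further comparisons evaluated after the threshold is first exceeded), and the example values are hypothetical.

```python
def initial_estimate(times, scores, threshold, lookahead=0):
    """Return the comparison time chosen as the initial estimate, or None."""
    for i, score in enumerate(scores):
        if score > threshold:
            # Keep evaluating up to `lookahead` further comparisons,
            # then choose the largest similarity measure among them.
            stop = min(len(scores), i + lookahead + 1)
            best = max(range(i, stop), key=lambda j: scores[j])
            return times[best]
    return None  # no similarity measure exceeded the threshold

# Hypothetical comparison times (ms) and real-valued similarity measures.
times = [0, 10, 20, 30, 40]
scores = [0.1, 0.2, 0.7, 0.9, 0.3]

first_crossing = initial_estimate(times, scores, threshold=0.5)
largest_nearby = initial_estimate(times, scores, threshold=0.5, lookahead=2)
```

With `lookahead=0` the sketch returns the first comparison time whose measure exceeds the threshold; a positive `lookahead` corresponds to continuing for a predetermined period and taking the largest measure.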

The computing device 800 may also compare the real portion 810 of the predetermined portion 806 to both the real portion 816 and the imaginary portion 818 of the expected sequence 814 to determine the transmission detection time 828. For example, the computing device 800 may compute one or more similarity measures 824 between the real portion 810 and both the real portion 816 and the imaginary portion 818 at one or more comparison times 826. As a specific example, the computing device 800 may perform one or more of the comparisons 600, 620, 630, 700, 720, 730. Similar to the comparison times 822, the comparison times 826 may correspond to a particular timestamp for a window (e.g., a sliding window) of the audio signal 802 that is compared to the expected sequence 814. For example, the comparison times 826 may represent a starting timestamp for the window, an ending timestamp for the window, a middle timestamp for the window, or any other timestamp for the window. In certain instances, the computing device 800 may compute the similarity measures 824 at comparison times 826 selected based on the initial estimate 821. For example, and as discussed above, the comparison times 826 may be selected to occur within a predetermined threshold time difference of the initial estimate 821 (e.g., to include a predetermined period of time and/or predetermined number of audio samples before and/or after the initial estimate 821 of the arrival time). The comparison times 826 may therefore cover a smaller range of times than the comparison times 822. Where the expected sequence 814 is a continuous signal (e.g., similar to the expected sequence 602), the similarity measures 824 may be calculated for the signal as a whole, as discussed above in connection with the comparisons 600, 620, 630. Where the expected sequence 814 includes one or more symbols (e.g., similar to the expected sequence 701), the similarity measures 824 may include similarity measures for individual symbols within the imaginary portion 818 and/or combined similarity measures that correspond to individual comparison times 826, as discussed above in connection with the comparisons 700, 720, 730.
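As a minimal sketch of the refined comparison for a continuous expected sequence, the following computes a combined similarity measure from correlations against the real and imaginary portions and searches only the comparison times near an initial estimate. The names `combined_similarity`, `refine_estimate`, and `radius`, and the choice to combine the two correlations as a normalized magnitude, are illustrative assumptions rather than the specific comparisons 600 through 730.

```python
import math

def combined_similarity(window, expected):
    """Compare a real-valued window to both the real and imaginary portions of
    a complex expected sequence and combine the two results (assumed measure)."""
    corr_real = sum(w * e.real for w, e in zip(window, expected))
    corr_imag = sum(w * e.imag for w, e in zip(window, expected))
    norm = math.sqrt(sum(w * w for w in window) * sum(abs(e) ** 2 for e in expected))
    return math.hypot(corr_real, corr_imag) / norm if norm > 0 else 0.0

def refine_estimate(signal, expected, initial, radius):
    """Return the comparison time (sample index) within `radius` samples of the
    initial estimate that has the largest combined similarity measure."""
    n = len(expected)
    first = max(0, initial - radius)
    last = min(len(signal) - n, initial + radius)
    best_time, best_score = None, -1.0
    for start in range(first, last + 1):
        score = combined_similarity(signal[start:start + n], expected)
        if score > best_score:
            best_time, best_score = start, score
    return best_time
```

Because only `2 * radius + 1` comparison times are evaluated at most, the more expensive comparison runs over a much smaller range than the initial real-valued search.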

The transmission detection time 828 may be selected based on the similarity measures 824. For example, the transmission detection time 828 may be selected as the comparison time 826 with the largest corresponding similarity measure 824. As explained above, comparing both the real portion 816 and the imaginary portion 818 of the expected sequence 814 to the real portion 810 of the predetermined portion 806 enables a more precise detection of when the predetermined portion 806 of the audio signal 802 most closely resembles the expected sequence 814 and, by extension, when the audio transmission 804 arrived. As also explained above, only performing comparisons utilizing both the real portion 816 and the imaginary portion 818 of the expected sequence 814 may be computationally expensive, so balancing such comparisons with comparison of the real portions 810, 816 may enable the more precise detection of the arrival of the audio transmission 804 without unduly increasing the computational resource requirements.

In certain implementations, the transmission detection time 828 may be determined to be the time at which the predetermined portion 806 begins within the audio signal 802. For example, the comparison times 826 may indicate the starting time of a portion of the audio signal that is compared to the imaginary portion 818 and the transmission detection time 828 may be selected to identify the starting time that corresponds to the largest similarity measure 824. Additionally or alternatively, the transmission detection time 828 may be determined to be the time at which the audio transmission 804 begins. For example, if the predetermined portion 806 does not occur at the beginning of the audio transmission 804, the computing device 800 may adjust the starting time indicated by the comparison time 826 to indicate the beginning time of the audio transmission 804. In particular, the predetermined portion 806 may occur at a particular time difference (e.g., a particular duration of time and/or number of symbols) after the beginning of the audio transmission 804. The starting time indicated by the comparison time 826 corresponding to the highest similarity measure 824 may therefore be adjusted by subtracting the particular time difference in order to indicate the beginning time of the audio transmission 804. In additional or alternative implementations, the transmission detection time 828 may be determined to be a time at which a different portion of the audio transmission 804 begins. For example, the transmission detection time 828 may be selected to indicate when the portion of the audio transmission 804 storing the data 812 begins. In such instances, techniques similar to those discussed above may be used to adjust the starting time indicated by the comparison time 826 to identify the beginning time of the data 812 within the audio transmission 804.
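The time adjustment described above amounts to simple offset arithmetic. The sketch below assumes all times are audio sample indices and that the layout of the audio transmission (the offset of the predetermined portion and of any other section from the beginning of the transmission) is known in advance; the function names and parameters are hypothetical.

```python
def symbols_to_samples(num_symbols, samples_per_symbol):
    """Convert a time difference given as a number of symbols into samples."""
    return num_symbols * samples_per_symbol

def section_start(predetermined_start, predetermined_offset, section_offset=0):
    """Map the detected starting time of the predetermined portion to the
    starting time of another section of the audio transmission.

    predetermined_offset: samples from the beginning of the transmission to the
        beginning of the predetermined portion (assumed known).
    section_offset: samples from the beginning of the transmission to the
        section of interest (0 for the transmission itself).
    """
    transmission_start = predetermined_start - predetermined_offset
    return transmission_start + section_offset
```

For instance, with a predetermined portion that begins two symbols into the transmission, `section_start(start, symbols_to_samples(2, 48))` would yield the beginning of the transmission, and passing a nonzero `section_offset` would yield the beginning of the data or a header instead.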

Once the transmission detection time 828 is detected, the audio transmission 804 may be extracted from the audio signal 802 for further processing. For example, the computing device 800 may extract a predetermined length of the audio signal 802 corresponding to all or part of the audio transmission 804 and may use the extracted portion of the audio signal 802 in further processing of the audio transmission 804.

It should be understood that any of the times discussed herein (e.g., comparison times, transmission detection times, initial estimates of the time of arrival of an audio transmission) may include times stored and/or computed as absolute times (e.g., a particular time and date), relative times (e.g., time since a predetermined fixed point, such as the beginning of the hour, week, day), and/or audio sample counts (e.g., a numerical identifier of a particular audio sample corresponding to the relevant point in time). In certain implementations, the similarity measures discussed herein may be calculated as difference measures (e.g., a measure of differences between audio signals instead of a measure of similarity). In such implementations, any of the embodiments discussed herein that include determining when a similarity measure exceeds a predetermined threshold may be implemented by determining when a difference measure falls below a predetermined threshold. Further, any of the embodiments discussed herein that include identifying and/or selecting a time corresponding to a largest similarity measure may be implemented by identifying and/or selecting a time corresponding to a smallest difference measure.
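The correspondence between similarity measures and difference measures can be sketched as follows. The helper names are hypothetical, and the conversion between an audio sample count and a relative time assumes a known, constant sample rate.

```python
def first_above(similarities, threshold):
    """Index of the first similarity measure that exceeds the threshold."""
    return next((i for i, s in enumerate(similarities) if s > threshold), None)

def first_below(differences, threshold):
    """Index of the first difference measure that falls below the threshold."""
    return next((i for i, d in enumerate(differences) if d < threshold), None)

def index_of_largest(similarities):
    """Index of the largest similarity measure."""
    return max(range(len(similarities)), key=lambda i: similarities[i])

def index_of_smallest(differences):
    """Index of the smallest difference measure."""
    return min(range(len(differences)), key=lambda i: differences[i])

def samples_to_seconds(sample_count, sample_rate_hz):
    """Express an audio sample count as a relative time in seconds."""
    return sample_count / sample_rate_hz
```

When a difference measure is a decreasing function of the similarity measure (for example, one minus a normalized similarity), the two threshold tests and the two selections identify the same comparison time.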

FIG. 9 illustrates a method 900 according to an exemplary embodiment of the present disclosure. The method 900 may be performed to determine the arrival time of an audio transmission within an audio signal. The method 900 may be implemented on a computer system, such as the computing device 800. The method 900 may also be implemented by a set of instructions stored on a computer readable medium that, when executed by a processor, cause the computer system to perform the method 900. For example, all or part of the method 900 may be implemented by a processor and a memory of the computing device 800. Although the examples below are described with reference to the flowchart illustrated in FIG. 9, many other methods of performing the acts associated with FIG. 9 may be used. For example, the order of some of the blocks may be changed, certain blocks may be combined with other blocks, one or more of the blocks may be repeated, and some of the blocks described may be optional.

The method 900 begins with receiving an audio signal containing an audio transmission (block 902). For example, the computing device 800 may receive an audio signal 802 containing audio transmission 804. The audio transmission 804 may include a predetermined portion 806 initially generated to be identical to an expected sequence 814. The transmission 804 may additionally include data 812 for transmission using the audio transmission 804.

Similarity measures may be computed based on a real portion of an expected sequence (block 904). For example, the computing device 800 may compute similarity measures 820 between a real portion 816 of the expected sequence 814 and a real portion of the audio signal 802. In particular, the similarity measures 820 may be computed at multiple comparison times 822 (e.g., at regular intervals based on the received audio signal 802) in order to initially detect the audio transmission 804. A first time may then be identified (block 906). For example, a first time indicating an initial estimate 821 of an arrival time of the audio transmission 804 may be identified. In particular, in certain instances, the first time may be identified based on the similarity measures 820 between the real portions 810, 816. For example, the first time may be identified as the first comparison time 822 that corresponds to a similarity measure 820 that exceeds a predetermined threshold and/or may be identified as the comparison time 822 corresponding to the largest similarity measure 820, as discussed above.

Similarity measures may be computed based on an imaginary portion of the expected sequence (block 908). For example, the computing device 800 may compute similarity measures 824 based on the imaginary portion 818 of the expected sequence 814. In particular, the similarity measures 824 may be computed to indicate a similarity of the real portion 810 of the predetermined portion 806 to both the real portion 816 and the imaginary portion 818 of the expected sequence 814. The similarity measures 824 may be computed at one or more comparison times 826. The comparison times 826 may be selected based on the first time identified at block 906. For example, the comparison times 826 for which similarity measures 824 are computed may be selected to include one or more of a predetermined range of times and/or a predetermined range of audio samples before and/or after the first time identified at block 906. A second time may then be identified (block 910). For example, the second time may be identified based on the similarity measures 824. As a specific example, the second time may be identified to indicate the comparison time 826 that corresponds to the largest similarity measure 824.

The beginning of the audio transmission may be determined based on the second time (block 912). For example, the computing device 800 may determine the beginning of the audio transmission 804 within the audio signal 802 based on the second time. As a specific example, the computing device 800 may determine a transmission detection time 828 based on the second time. As explained above, the transmission detection time 828 may be determined to indicate the beginning of the audio transmission 804 within the audio signal 802. In certain instances, rather than determining the beginning of the audio transmission, the computing device 800 may determine a beginning of a portion of the audio transmission 804, such as a beginning of the predetermined portion 806 and/or a beginning of a portion of the audio transmission 804 containing the data 812 and/or other information (e.g., a header of the audio transmission 804). Upon determining the beginning of the audio transmission 804, all or part of the audio transmission 804 may be extracted from the audio signal 802 for further processing.
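The blocks of the method 900 can be chained in flowchart order as in the following compact sketch. It is one possible arrangement under stated assumptions, not a definitive implementation: the similarity measure is an assumed normalized correlation, times are sample indices, and the names `similarity`, `method_900`, `step`, `threshold`, `radius`, and `predetermined_offset` are hypothetical.

```python
import math

def similarity(window, expected, use_imag):
    """Normalized correlation of a real-valued window with the expected
    sequence. With use_imag=False only the real portion is compared; with
    use_imag=True the real and imaginary portions are both compared."""
    corr_real = sum(w * e.real for w, e in zip(window, expected))
    corr_imag = sum(w * e.imag for w, e in zip(window, expected)) if use_imag else 0.0
    ref = sum(e.real ** 2 + (e.imag ** 2 if use_imag else 0.0) for e in expected)
    norm = math.sqrt(sum(w * w for w in window) * ref)
    return math.hypot(corr_real, corr_imag) / norm if norm > 0 else 0.0

def method_900(signal, expected, step, threshold, radius, predetermined_offset=0):
    """Return the estimated beginning of the audio transmission, or None."""
    n = len(expected)
    last = len(signal) - n
    # Blocks 904 and 906: real-valued comparisons at regular intervals, then
    # the first comparison time whose similarity exceeds the threshold.
    first_time = None
    for start in range(0, last + 1, step):
        if similarity(signal[start:start + n], expected, use_imag=False) > threshold:
            first_time = start
            break
    if first_time is None:
        return None
    # Blocks 908 and 910: comparisons using both portions, restricted to
    # comparison times near the first time; keep the largest.
    candidates = range(max(0, first_time - radius), min(last, first_time + radius) + 1)
    second_time = max(candidates, key=lambda s: similarity(signal[s:s + n], expected, True))
    # Block 912: adjust for the position of the predetermined portion.
    return second_time - predetermined_offset
```

The split keeps the per-sample cost of the continuous scan at one real-valued correlation, with the two-portion comparison evaluated only for the handful of comparison times around the first time.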

By utilizing comparisons of both real portions and imaginary portions of the expected sequence 814 in the comparisons with the audio transmission 804, the method 900 is able to both capture the computational efficiencies of performing real-valued comparisons and the improved accuracy of performing imaginary-valued comparisons. In this way, the method 900 is able to improve the accuracy with which audio transmissions 804 are identified within audio signals 802, without compromising the battery life and/or computing processing requirements of the computing device 800. Furthermore, by increasing the accuracy with which audio transmissions 804 are detected and identified, the method 900 may improve the ability for computing devices connected to transmitter/receiver arrays 300 to distinguish between audio transmissions received from different directions. For example, by more precisely determining the transmission detection times 828, computing devices may be better able to discern when different audio transmissions are received and may therefore be less likely to combine audio signals containing different audio transmissions, which would reduce the accuracy of subsequent processing of the audio transmissions.

FIG. 10 illustrates an example computer system 1000 that may be utilized to implement one or more of the devices and/or components of FIGS. 1 and 8, such as the computing devices 102, 104, 800. In particular embodiments, one or more computer systems 1000 perform one or more steps of one or more methods described or illustrated herein. In particular embodiments, one or more computer systems 1000 provide the functionalities described or illustrated herein. In particular embodiments, software running on one or more computer systems 1000 performs one or more steps of one or more methods described or illustrated herein or provides the functionalities described or illustrated herein. Particular embodiments include one or more portions of one or more computer systems 1000. Herein, a reference to a computer system may encompass a computing device, and vice versa, where appropriate. Moreover, a reference to a computer system may encompass one or more computer systems, where appropriate.

This disclosure contemplates any suitable number of computer systems 1000. This disclosure contemplates the computer system 1000 taking any suitable physical form. As an example and not by way of limitation, the computer system 1000 may be an embedded computer system, a system-on-chip (SOC), a single-board computer system (SBC) (such as, for example, a computer-on-module (COM) or system-on-module (SOM)), a desktop computer system, a laptop or notebook computer system, an interactive kiosk, a mainframe, a mesh of computer systems, a mobile telephone, a personal digital assistant (PDA), a server, a tablet computer system, an augmented/virtual reality device, or a combination of two or more of these. Where appropriate, the computer system 1000 may include one or more computer systems 1000; be unitary or distributed; span multiple locations; span multiple machines; span multiple data centers; or reside in a cloud, which may include one or more cloud components in one or more networks. Where appropriate, one or more computer systems 1000 may perform without substantial spatial or temporal limitation one or more steps of one or more methods described or illustrated herein. As an example and not by way of limitation, one or more computer systems 1000 may perform in real time or in batch mode one or more steps of one or more methods described or illustrated herein. One or more computer systems 1000 may perform at different times or at different locations one or more steps of one or more methods described or illustrated herein, where appropriate.

In particular embodiments, computer system 1000 includes a processor 1006, memory 1004, storage 1008, an input/output (I/O) interface 1010, and a communication interface 1012. Although this disclosure describes and illustrates a particular computer system having a particular number of particular components in a particular arrangement, this disclosure contemplates any suitable computer system having any suitable number of any suitable components in any suitable arrangement.

In particular embodiments, the processor 1006 includes hardware for executing instructions, such as those making up a computer program. As an example and not by way of limitation, to execute instructions, the processor 1006 may retrieve (or fetch) the instructions from an internal register, an internal cache, memory 1004, or storage 1008; decode and execute the instructions; and then write one or more results to an internal register, internal cache, memory 1004, or storage 1008. In particular embodiments, the processor 1006 may include one or more internal caches for data, instructions, or addresses. This disclosure contemplates the processor 1006 including any suitable number of any suitable internal caches, where appropriate. As an example and not by way of limitation, the processor 1006 may include one or more instruction caches, one or more data caches, and one or more translation lookaside buffers (TLBs). Instructions in the instruction caches may be copies of instructions in memory 1004 or storage 1008, and the instruction caches may speed up retrieval of those instructions by the processor 1006. Data in the data caches may be copies of data in memory 1004 or storage 1008 that are to be operated on by computer instructions; the results of previous instructions executed by the processor 1006 that are accessible to subsequent instructions or for writing to memory 1004 or storage 1008; or any other suitable data. The data caches may speed up read or write operations by the processor 1006. The TLBs may speed up virtual-address translation for the processor 1006. In particular embodiments, processor 1006 may include one or more internal registers for data, instructions, or addresses. This disclosure contemplates the processor 1006 including any suitable number of any suitable internal registers, where appropriate. Where appropriate, the processor 1006 may include one or more arithmetic logic units (ALUs), be a multi-core processor, or include one or more processors 1006. Although this disclosure describes and illustrates a particular processor, this disclosure contemplates any suitable processor.

In particular embodiments, the memory 1004 includes main memory for storing instructions for the processor 1006 to execute or data for processor 1006 to operate on. As an example, and not by way of limitation, computer system 1000 may load instructions from storage 1008 or another source (such as another computer system 1000) to the memory 1004. The processor 1006 may then load the instructions from the memory 1004 to an internal register or internal cache. To execute the instructions, the processor 1006 may retrieve the instructions from the internal register or internal cache and decode them. During or after execution of the instructions, the processor 1006 may write one or more results (which may be intermediate or final results) to the internal register or internal cache. The processor 1006 may then write one or more of those results to the memory 1004. In particular embodiments, the processor 1006 executes only instructions in one or more internal registers or internal caches or in memory 1004 (as opposed to storage 1008 or elsewhere) and operates only on data in one or more internal registers or internal caches or in memory 1004 (as opposed to storage 1008 or elsewhere). One or more memory buses (which may each include an address bus and a data bus) may couple the processor 1006 to the memory 1004. The bus may include one or more memory buses, as described in further detail below. In particular embodiments, one or more memory management units (MMUs) reside between the processor 1006 and memory 1004 and facilitate accesses to the memory 1004 requested by the processor 1006. In particular embodiments, the memory 1004 includes random access memory (RAM). This RAM may be volatile memory, where appropriate. Where appropriate, this RAM may be dynamic RAM (DRAM) or static RAM (SRAM). Moreover, where appropriate, this RAM may be single-ported or multi-ported RAM. This disclosure contemplates any suitable RAM. Memory 1004 may include one or more memories 1004, where appropriate. Although this disclosure describes and illustrates particular memory implementations, this disclosure contemplates any suitable memory implementation.

In particular embodiments, the storage 1008 includes mass storage for data or instructions. As an example and not by way of limitation, the storage 1008 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, or a Universal Serial Bus (USB) drive or a combination of two or more of these. The storage 1008 may include removable or non-removable (or fixed) media, where appropriate. The storage 1008 may be internal or external to computer system 1000, where appropriate. In particular embodiments, the storage 1008 is non-volatile, solid-state memory. In particular embodiments, the storage 1008 includes read-only memory (ROM). Where appropriate, this ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory or a combination of two or more of these. This disclosure contemplates mass storage 1008 taking any suitable physical form. The storage 1008 may include one or more storage control units facilitating communication between processor 1006 and storage 1008, where appropriate. Where appropriate, the storage 1008 may include one or more storages 1008. Although this disclosure describes and illustrates particular storage, this disclosure contemplates any suitable storage.

In particular embodiments, the I/O interface 1010 includes hardware, software, or both, providing one or more interfaces for communication between computer system 1000 and one or more I/O devices. The computer system 1000 may include one or more of these I/O devices, where appropriate. One or more of these I/O devices may enable communication between a person (i.e., a user) and computer system 1000. As an example and not by way of limitation, an I/O device may include a keyboard, keypad, microphone, monitor, screen, display panel, mouse, printer, scanner, speaker, still camera, stylus, tablet, touch screen, trackball, video camera, another suitable I/O device or a combination of two or more of these. An I/O device may include one or more sensors. Where appropriate, the I/O interface 1010 may include one or more device or software drivers enabling processor 1006 to drive one or more of these I/O devices. The I/O interface 1010 may include one or more I/O interfaces 1010, where appropriate. Although this disclosure describes and illustrates a particular I/O interface, this disclosure contemplates any suitable I/O interface or combination of I/O interfaces.

In particular embodiments, communication interface 1012 includes hardware, software, or both providing one or more interfaces for communication (such as, for example, packet-based communication) between computer system 1000 and one or more other computer systems 1000 or one or more networks 1014. As an example and not by way of limitation, communication interface 1012 may include a network interface controller (NIC) or network adapter for communicating with an Ethernet or any other wire-based network or a wireless NIC (WNIC) or wireless adapter for communicating with a wireless network, such as a Wi-Fi network. This disclosure contemplates any suitable network 1014 and any suitable communication interface 1012 for the network 1014. As an example and not by way of limitation, the network 1014 may include one or more of an ad hoc network, a personal area network (PAN), a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), or one or more portions of the Internet or a combination of two or more of these. One or more portions of one or more of these networks may be wired or wireless. As an example, computer system 1000 may communicate with a wireless PAN (WPAN) (such as, for example, a Bluetooth® WPAN), a Wi-Fi network, a WI-MAX network, a cellular telephone network (such as, for example, a Global System for Mobile Communications (GSM) network), or any other suitable wireless network or a combination of two or more of these. Computer system 1000 may include any suitable communication interface 1012 for any of these networks, where appropriate. Communication interface 1012 may include one or more communication interfaces 1012, where appropriate. Although this disclosure describes and illustrates particular communication interface implementations, this disclosure contemplates any suitable communication interface implementation.

The computer system 1000 may also include a bus. The bus may include hardware, software, or both and may communicatively couple the components of the computer system 1000 to each other. As an example and not by way of limitation, the bus may include an Accelerated Graphics Port (AGP) or any other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HYPERTRANSPORT (HT) interconnect, an Industry Standard Architecture (ISA) bus, an INFINIBAND interconnect, a low-pin-count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a serial advanced technology attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or another suitable bus or a combination of two or more of these buses. The bus may include one or more buses, where appropriate. Although this disclosure describes and illustrates a particular bus, this disclosure contemplates any suitable bus or interconnect.

Herein, a computer-readable non-transitory storage medium or media may include one or more semiconductor-based or other types of integrated circuits (ICs) (e.g., field-programmable gate arrays (FPGAs) or application-specific ICs (ASICs)), hard disk drives (HDDs), hybrid hard drives (HHDs), optical discs, optical disc drives (ODDs), magneto-optical discs, magneto-optical drives, floppy diskettes, floppy disk drives (FDDs), magnetic tapes, solid-state drives (SSDs), RAM-drives, SECURE DIGITAL cards or drives, any other suitable computer-readable non-transitory storage media, or any suitable combination of two or more of these, where appropriate. A computer-readable non-transitory storage medium may be volatile, non-volatile, or a combination of volatile and non-volatile, where appropriate.

Herein, “or” is inclusive and not exclusive, unless expressly indicated otherwise or indicated otherwise by context. Therefore, herein, “A or B” means “A, B, or both,” unless expressly indicated otherwise or indicated otherwise by context. Moreover, “and” is both joint and several, unless expressly indicated otherwise or indicated otherwise by context. Therefore, herein, “A and B” means “A and B, jointly or severally,” unless expressly indicated otherwise or indicated otherwise by context.

The scope of this disclosure encompasses all changes, substitutions, variations, alterations, and modifications to the example embodiments described or illustrated herein that a person having ordinary skill in the art would comprehend. The scope of this disclosure is not limited to the example embodiments described or illustrated herein. Moreover, although this disclosure describes and illustrates respective embodiments herein as including particular components, elements, features, functions, operations, or steps, any of these embodiments may include any combination or permutation of any of the components, elements, features, functions, operations, or steps described or illustrated anywhere herein that a person having ordinary skill in the art would comprehend. Furthermore, reference in the appended claims to an apparatus or system or a component of an apparatus or system being adapted to, arranged to, capable of, configured to, enabled to, operable to, or operative to perform a particular function encompasses that apparatus, system, or component, whether or not it or that particular function is activated, turned on, or unlocked, as long as that apparatus, system, or component is so adapted, arranged, capable, configured, enabled, operable, or operative. Additionally, although this disclosure describes or illustrates particular embodiments as providing particular advantages, particular embodiments may provide none, some, or all of these advantages.

1. A method comprising: receiving an audio signal that contains an audio transmission, the audio transmission containing (i) a predetermined portion and (ii) data for transmission using the audio transmission; computing a first plurality of similarity measures between (i) a real portion of an expected sequence of complex values and an imaginary portion of the expected sequence of complex values and (ii) a first plurality of portions of the audio signal; identifying, from among the first plurality of portions of the audio signal, a first portion of the audio signal with the most similarity to the expected sequence of complex values according to the first plurality of similarity measures; and determining an arrival time of the audio transmission based on a starting time of the first portion of the audio signal.
2. The method of claim 1, wherein identifying the first portion of the audio signal comprises: determining that a first portion of the audio signal from among the first plurality of portions of the audio signal corresponds to the largest of the first plurality of similarity measures.
3. The method of claim 1, wherein the first plurality of portions of the audio signal begin at a first plurality of times and the first portion of the audio signal begins at a first time from among the first plurality of times, and wherein determining the arrival time of the audio transmission comprises: determining the arrival time of the audio transmission based on the first time.
4. The method of claim 1, further comprising, prior to computing the first plurality of similarity measures: computing a second plurality of similarity measures between (i) the real portion of the expected sequence of complex values and (ii) a second plurality of portions of the audio signal; and identifying, from among the second plurality of portions of the audio signal, a second portion of the audio signal with the most similarity to the expected sequence of complex values according to the second plurality of similarity measures.
5. The method of claim 4, wherein the first plurality of portions of the audio signal are selected based on a starting time of the second portion of the audio signal.
6. The method of claim 1, wherein the imaginary portions of the expected sequence indicate phase differences for samples of the expected sequence.
7. The method of claim 6, wherein the predetermined portion of the audio transmission is initially transmitted to include the real portion of the expected sequence and the phase differences indicated by the imaginary portion of the expected sequence.
8. The method of claim 1, wherein the expected sequence of complex values is a continuous linear phase chirp signal.
9. The method of claim 1, wherein the first plurality of similarity measures are calculated as correlation measures.
10. The method of claim 9, wherein the correlation measures include convolution measures.
11. The method of claim 1, wherein the first plurality of portions of the audio signal and the second plurality of portions of the audio signal contain only real-valued audio signals.
12. A system comprising: a processor; and a memory storing instructions which, when executed by the processor, cause the processor to: receive an audio signal that contains an audio transmission, the audio transmission containing (i) a predetermined portion and (ii) data for transmission using the audio transmission; compute a first plurality of similarity measures between (i) a real portion of an expected sequence of complex values and an imaginary portion of the expected sequence of complex values and (ii) a first plurality of portions of the audio signal; identify, from among the first plurality of portions of the audio signal, a first portion of the audio signal with the most similarity to the expected sequence of complex values according to the first plurality of similarity measures; and determine an arrival time of the audio transmission based on a starting time of the first portion of the audio signal.
13. The system of claim 12, wherein the instructions further cause the processor, while identifying the first portion of the audio signal, to: determine that a first portion of the audio signal from among the first plurality of portions of the audio signal corresponds to the largest of the first plurality of similarity measures.
14. The system of claim 12, wherein the first plurality of portions of the audio signal begin at a first plurality of times and the first portion of the audio signal begins at a first time from among the first plurality of times, and wherein the instructions further cause the processor, while determining the arrival time of the audio transmission, to: determine the arrival time of the audio transmission based on the first time.
15. The system of claim 12, wherein the instructions further cause the processor, prior to computing the first plurality of similarity measures, to: compute a second plurality of similarity measures between (i) the real portion of the expected sequence of complex values and (ii) a second plurality of portions of the audio signal; and identify, from among the second plurality of portions of the audio signal, a second portion of the audio signal with the most similarity to the expected sequence of complex values according to the second plurality of similarity measures.
16. The system of claim 15, wherein the first plurality of portions of the audio signal are selected based on a starting time of the second portion of the audio signal.
17. The system of claim 12, wherein the imaginary portions of the expected sequence indicate phase differences for samples of the expected sequence.
18. The system of claim 17, wherein the predetermined portion of the audio transmission is initially transmitted to include the real portion of the expected sequence and the phase differences indicated by the imaginary portion of the expected sequence.
19. The system of claim 12, wherein the expected sequence of complex values is a continuous linear phase chirp signal.
20. The system of claim 12, wherein the at least one of the first plurality of similarity measures and the second plurality of similarity measures are calculated as correlation measures.